Using AgreementMaker to Align Ontologies for OAEI 2009: Overview, Results, and Outlook

Authors

  • Isabel F. Cruz
  • Flavio Palandri Antonelli
  • Cosmin Stroe
  • Ulas C. Keles
  • Angela Maduko
Abstract

This paper describes our participation in the Ontology Alignment Evaluation Initiative (OAEI) 2009 with the AgreementMaker system for ontology matching, in which we obtained excellent results. In particular, we participated in the benchmarks, anatomy, and conference tracks. In the anatomy track, we competed against nine other systems in all four subtracks, obtaining the best result in subtrack 3 and the second best result in subtracks 1 and 2. We were also first in finding the highest number of non-trivial correspondences. Furthermore, AgreementMaker came in first place among seven participants in the conference track and achieved the highest precision among all thirteen participating systems in the benchmarks track. In addition to presenting this year’s results, we give an overview of the AgreementMaker system, discuss ways in which we plan to further improve it in the future, and present suggestions for future editions of the OAEI competition.

1 Presentation of the system

As the Semantic Web evolves, more and more ontologies are being developed to provide conceptual descriptions of several domains of interest. Ontology matching or alignment, which involves the task of finding correspondences called mappings between semantically related entities in two different ontologies, is needed to realize semantic interoperation and heterogeneous data integration. A matching is a set of mappings established between two ontologies: the source ontology and the target ontology. Automatic matching methods are highly desirable to allow for scalability both in the size and number of ontologies being aligned. Our collaboration with domain experts in the geospatial domain [7] has revealed that they value automatic matching methods, especially for ontologies with thousands of concepts. However, they want to be able to evaluate the matching process, and therefore need to be directly involved in the loop.
Driven by these requirements, we have developed the AgreementMaker system (www.AgreementMaker.org), which integrates efficient automatic matching strategies with a multi-purpose user interface and a module to evaluate matchings [3]. The problem of finding matchings is challenging on several counts. For example, a particular matching method may be effective for a given scenario, but not for others. Also, within the same scenario, the use of different parameters can change the outcome significantly. Therefore, our framework introduces a combined approach that takes advantage of several matching techniques focusing on different features of the ontologies and that allows for different parameters to be set. In particular, our architecture allows for serial and parallel composition: the output of one or more methods can be used as input to another method, or several methods can be used on the same input and their results then combined. A set of mappings may therefore be the result of a sequence of steps called layers. The motivation behind this framework is to provide the capability of combining as many mapping layers as needed in order to capture a wide range of relationships between concepts in real-world scenarios [1]. There are parameters that can be defined for all methods, such as cardinality and threshold, whereas other parameters are method dependent. The parameter values can be set manually by the user or by automatic methods that take into account quality measures [2].

Research supported by NSF Awards ITR IIS-0326284, IIS-0513553, and IIS-0812258.

We have been developing AgreementMaker since 2001, with a focus on real-world applications [5, 8] and in particular on geospatial applications [4, 6, 7, 9–12, 16]. However, the current version of AgreementMaker and its implementation represent a whole new effort. Not only have we added significant new aspects to the system, but we have also almost completely reimplemented it in the last year.
For example, in September of 2008 the previous implementation consisted of 9,000 lines of Java code, whereas in September of 2009 the new implementation had 29,000 lines.

The new AgreementMaker system [1–3] supports:
(1) user requirements, as expressed by domain experts;
(2) a wide range of input (ontology) and output (agreement file) formats;
(3) a large choice of matching methods depending on the granularity of the set of components being matched (local vs. global), on the features considered in the comparison (conceptual vs. structural), on the amount of intervention that they require from users (manual vs. automatic), on usage (standalone vs. composed), and on the types of components to consider (schema only or schema and instances);
(4) improved performance, that is, accuracy (precision, recall, F-measure) and efficiency (execution time) for the automatic methods;
(5) an extensible architecture to incorporate new methods easily and to tune their performance;
(6) the capability to evaluate, compare, and combine different strategies and matching results;
(7) a comprehensive user interface that supports advanced visualization techniques and a control panel that drives all the matching methods and evaluation strategies;
(8) a feedback loop that accepts suggestions and corrections by users and extrapolates new mappings.

1.1 State, purpose, general statement

AgreementMaker comprises a wide range of automatic matching algorithms called matchers, an extensible and modular architecture, a multi-purpose user interface, a set of evaluation strategies, and various manual (e.g., visual comparison) and semi-automatic features (e.g., user feedback loop). Given the automatic processing requirement imposed by OAEI, we could mainly make use of the first two features. In particular, we adopted seven different matchers for the competition and took advantage of the modular architecture to organize those matchers into four different matching layers.
The evaluation techniques came into play only in the combination phase, to disambiguate the quality of the mappings to be selected. Even though we could not take direct advantage of the user interface of AgreementMaker in the competition, we want to highlight the benefits it provided prior to the competition. For example, the user interface can display any ontology (the largest ones we have tested have 30,000 concepts), so we were able to display the OAEI ontologies to investigate their characteristics (see Figure 1). In addition, we could test, tune, and evaluate both the individual matchers and the particular composition of matchers that we used in the competition.

Fig. 1. Graphical User Interface of the AgreementMaker displaying ontologies from the benchmarks track.

1.2 Specific techniques used

For the OAEI 2009 competition, we have created a stack of matchers, shown in Figure 2, which are run on the input ontologies to compute the final alignment set. First, three string-based techniques are independently run on the input ontologies: the Base Similarity Matcher (BSM) [7], the Parametric String-based Matcher (PSM) [2], and the Vector-based Multi-word Matcher (VMM) [2].

Fig. 2. AgreementMaker OAEI 2009 matcher stack.

BSM is a fundamental string-based matcher, which uses rule-based word stemming, stop word removal, and word normalization in order to find mappings. Going beyond the capabilities of BSM, PSM combines an edit distance measure and a substring measure in order to find mappings. Specifically for this campaign, PSM uses the following formula:

$$\sigma(a, b) = 0.6 \cdot \mathrm{substring}(a, b) + 0.4 \cdot \mathrm{edit\_distance}(a, b)$$

Our last string similarity matcher, VMM, compiles a virtual document for every concept of an ontology, then transforms the strings into TF-IDF vectors and computes the similarity using the cosine similarity measure.

After running the string matchers in parallel, their results are combined using the Linear Weighted Combination (LWC) matcher [2].
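As a rough illustration of the PSM formula above, the following Python sketch computes a weighted sum of a substring score and an edit-distance score. Only the weights 0.6 and 0.4 come from the text. The way both measures are turned into similarities in [0, 1] is an assumption made for this example (longest common substring length and Levenshtein distance, each normalized by the length of the longer string), as are all function names; the measures actually used by AgreementMaker are not specified here.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, curr[-1])
        prev = curr
    return best


def edit_similarity(a: str, b: str) -> float:
    """ASSUMPTION: edit distance normalized into a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def substring_similarity(a: str, b: str) -> float:
    """ASSUMPTION: longest common substring length over the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else longest_common_substring(a, b) / longest


def psm_similarity(a: str, b: str) -> float:
    """Weighted sum with the 0.6 / 0.4 weights quoted in the text."""
    a, b = a.lower(), b.lower()  # ASSUMPTION: case-insensitive comparison
    return 0.6 * substring_similarity(a, b) + 0.4 * edit_similarity(a, b)


if __name__ == "__main__":
    for pair in [("Author", "author"), ("chairman", "chair"), ("abc", "xyz")]:
        print(pair, round(psm_similarity(*pair), 3))
```

Normalizing both measures to the same range keeps the weighted sum in [0, 1], which makes it comparable with a similarity threshold.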
The LWC matcher uses the formula:

$$\sigma_{LWC}(a, b) = w_{BSM} \cdot \sigma_{BSM}(a, b) + w_{PSM} \cdot \sigma_{PSM}(a, b) + w_{VMM} \cdot \sigma_{VMM}(a, b)$$

where the weights for each similarity are automatically calculated using the local-confidence quality measure. After the LWC matcher runs, we have a single, combined set of alignments that includes the best alignments from each of the string-based methods.

The next matcher, the Descendant’s Similarity Inheritance (DSI) [7] matcher, is a structure-based matcher that considers the ancestors of the concepts in a mapping in order to increase the similarity of the mapping. The DSI matcher is based on the following heuristic: if two nodes are matched with high similarity, then the similarity between the descendants of those nodes should increase. New mappings are created by the DSI matcher when the similarity of a mapping is increased beyond the threshold established for that matcher.

The last step uses a lexical matcher, which considers not only the terms in an ontology, but also the synonyms of those terms as provided by a thesaurus (e.g., WordNet or UMLS).

In order to take advantage of the unique nature of the conference track, we performed an extra computation step, which we used in a new configuration of AgreementMaker called AgreementMakerExt. The OAEI 2009 matcher stack described above considers only two ontologies at a time. To go beyond this pairwise view, we have added a step that tries to take advantage of the transitivity between ontology mappings. We call this computation the conflict resolution step.

As shown in Figure 3, we consider two ontologies $O_A$ and $O_B$, which have a mapping between them denoted $m_{A \leftrightarrow B}(a_i \in O_A, b_j \in O_B)$, given that concept $a_i \in O_A$ has been matched to concept $b_j \in O_B$. We then consider a third ontology $O_C$ such that concept $a_i \in O_A$ is mapped to some concept $c_k \in O_C$ by mapping $m_{A \leftrightarrow C}(a_i \in O_A, c_k \in O_C)$. We also identify a mapping $m_{B \leftrightarrow C}(b_j \in O_B, c_h \in O_C)$ if there exists a concept $c_h \in O_C$ that matches $b_j \in O_B$.
Note that $m_{A \leftrightarrow C}$ and $m_{B \leftrightarrow C}$ may point to different concepts in $O_C$ (i.e., $k \neq h$).

Fig. 3. Conflict resolution using a rating system.

We now implement a rating system. If $m_{A \leftrightarrow C}$ and $m_{B \leftrightarrow C}$ both map to the same concept in $O_C$ (i.e., $k = h$), we increment the rating of all three mappings by 1. If $m_{A \leftrightarrow C}$ or $m_{B \leftrightarrow C}$ does not exist, we decrement the rating of any existing mappings by 1. Likewise, if $m_{A \leftrightarrow C}$ and $m_{B \leftrightarrow C}$ exist, but map to different concepts in $O_C$ (i.e., $k \neq h$), we decrement the rating of all three mappings by 1. This rating is performed for all the mappings between all the ontologies. Finally, we sweep through the rated mappings and modify the alignments between any two ontologies to choose the mappings that have been rated the highest, resolving any conflicts by choosing the mappings with the highest similarity.

1.3 Link to the set of provided alignments (in align format)

AgreementMaker alignment sets for OAEI can be found at http://www.AgreementMaker.org/OAEI09 Results.zip.
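The rating system of the conflict resolution step in Section 1.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the AgreementMaker implementation: the data layout (a dictionary of one-to-one alignments keyed by ontology pair), the function names, and the choice to visit every mapping once as the base mapping $m_{A \leftrightarrow B}$ for each third ontology are all hypothetical. Under this traversal a fully consistent triangle of mappings is rewarded once per edge. The final sweep that selects the highest-rated mappings is not shown.

```python
from collections import defaultdict
from itertools import combinations


def build_index(alignments):
    """Return index[(X, Y)][x] = y for both directions of every alignment."""
    index = defaultdict(dict)
    for (src, tgt), pairs in alignments.items():
        for s, t in pairs.items():
            index[(src, tgt)][s] = t
            index[(tgt, src)][t] = s
    return index


def mapping_key(ont1, c1, ont2, c2):
    """Direction-independent identifier for a mapping between two concepts."""
    return frozenset({(ont1, c1), (ont2, c2)})


def rate_mappings(ontologies, alignments):
    """Rate every mapping by checking it against each third ontology.

    ASSUMPTION: alignments are one-to-one and given as
    {(X, Y): {concept_in_X: concept_in_Y}}.
    """
    index = build_index(alignments)
    ratings = defaultdict(int)
    for A, B in combinations(ontologies, 2):
        for a, b in index.get((A, B), {}).items():
            m_ab = mapping_key(A, a, B, b)
            ratings[m_ab] += 0  # ensure every mapping has an entry
            for C in ontologies:
                if C in (A, B):
                    continue
                c_k = index.get((A, C), {}).get(a)  # target of m_A<->C, if any
                c_h = index.get((B, C), {}).get(b)  # target of m_B<->C, if any
                existing = [m_ab]
                if c_k is not None:
                    existing.append(mapping_key(A, a, C, c_k))
                if c_h is not None:
                    existing.append(mapping_key(B, b, C, c_h))
                # +1 only when both mappings exist and agree (k = h);
                # otherwise every mapping that exists is penalized by 1.
                agree = c_k is not None and c_k == c_h
                for m in existing:
                    ratings[m] += 1 if agree else -1
    return dict(ratings)


if __name__ == "__main__":
    consistent = {
        ("A", "B"): {"Paper": "Article"},
        ("A", "C"): {"Paper": "Contribution"},
        ("B", "C"): {"Article": "Contribution"},
    }
    for mapping, rating in rate_mappings(["A", "B", "C"], consistent).items():
        print(sorted(mapping), rating)
```

Using a direction-independent key means that the same mapping accumulates a single rating regardless of which ontology pair is being processed when it is encountered.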

Related resources

Using AgreementMaker to Align Ontologies for OAEI

The AgreementMaker system is unique in that it features a powerful user interface, a flexible and extensible architecture, an integrated evaluation engine that relies on inherent quality measures, and semi-automatic and automatic methods. This paper describes the participation of AgreementMaker in the 2011 OAEI competition in four tracks: benchmarks, anatomy, conference, and instance matching. ...

Testing the AgreementMaker System in the Anatomy Task of OAEI 2012

The AgreementMaker system was the leading system in the anatomy task of the Ontology Alignment Evaluation Initiative (OAEI) competition in 2011. While AgreementMaker did not compete in OAEI 2012, here we report on its performance in the 2012 anatomy task, using the same configurations of AgreementMaker submitted to OAEI 2011. Additionally, we also test AgreementMaker using an updated version of...

Using the AgreementMaker to Align Ontologies for the OAEI Campaign 2007

Ontology matching, the task of finding the correspondences that exist between concepts in two different ontologies, is a promising solution to the semantic heterogeneity problem that is faced by the Semantic Web community. This paper describes AgreementMaker, a system for automatically aligning two ontologies. The AgreementMaker system comprises an extensible architecture that allows for the in...

AgreementMaker: Efficient Matching for Large Real-World Schemas and Ontologies

We present the AgreementMaker system for matching realworld schemas and ontologies, which may consist of hundreds or even thousands of concepts. The end users of the system are sophisticated domain experts whose needs have driven the design and implementation of the system: they require a responsive, powerful, and extensible framework to perform, evaluate, and compare matching methods. The syst...

Using AgreementMaker to align ontologies for OAEI 2010

The AgreementMaker system is unique in that it features a powerful user interface, a flexible and extensible architecture, an integrated evaluation engine that relies on inherent quality measures, and semi-automatic and automatic methods. This paper describes the participation of AgreementMaker in the 2010 OAEI competition in three tracks: benchmarks, anatomy, and conference. After its successf...

Publication date: 2009